Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dill._dill submodule being saved as GLOBAL "dill._shims" "_dill" #490

Merged
merged 1 commit into from
Jun 8, 2022

Conversation

leogama
Copy link
Contributor

@leogama leogama commented May 31, 2022

I'm tracing some problems with my "portable" mode prototype and hit a bug with references to the _dill submodule saved as global:

>>> import dill
>>> dill.dumps(dill._dill, 0)
b'cdill._shims\n_dill\np0\n.'

The _dill submodule is saved as an attribute of the dill._shims submodule, which is valid, but it should be just dill._dill.

The _dill submodule is special-cased to be saved as global:

dill/dill/_dill.py

Lines 1807 to 1810 in 8e5e450

elif PY3 and obj.__name__ == "dill._dill":
log.info("M2: %s" % obj)
pickler.save_global(obj, name="_dill")
log.info("# M2")

But pickle.whichmodule misidentifies it as pertaining to dill._shims, because this entry is set earlier in sys.modules and it picks the first match.


The change fixes things for this especial case, but there is potential for new bugs related to other submodules:

>>> import dill, glob, importlib, os, pickle, pprint
>>> os.chdir(dill.__path__[0])
>>> modules = [mod.rpartition('.')[0] for mod in glob.glob('*py') if not mod.startswith('__')]
>>> modules = {mod: importlib.import_module('dill.' + mod) for mod in modules}
>>> pprint.pprint({name: pickle.whichmodule(mod, name) for name, mod in modules.items()})
{'_dill': 'dill._shims',
 '_objects': 'dill',
 '_shims': 'dill._dill',
 'detect': 'dill',
 'objtypes': 'dill',
 'pointers': 'dill',
 'settings': '__main__',
 'source': 'dill',
 'temp': 'dill'}

Note how _shims is attributed to dill._dill and settings is attributed to __main__(???). But currently they are not saved as globals.

I'm tracing some problems with my "portable" mode prototype and hit a bug with references the `_dill` submodule saved as global:

```python
>>> import dill
>>> dill.dumps(dill._dill, 0)
b'cdill._shims\n_dill\np0\n.'
```
The `_dill` submodule is saved as an attribute of the `dill._shims` submodule, which is valid, but it should be just `dill._dill`.

The `_dill` submodule is special-cased to be saved as global:
https://github.com/uqfoundation/dill/blob/8e5e450b9ed8dff164fc259c468365e2235c6581/dill/_dill.py#L1807-L1810

But `pickle.whichmodule` misidentifies it as pertaining to `dill._shims`, because this entry is set earlier in `sys.modules` and it picks the first match.

---

The change fixes things for this especial case, but there is potential for new bugs related to other submodules:
```python
>>> import dill, glob, importlib, os, pickle, pprint
>>> os.chdir(dill.__path__[0])
>>> modules = [mod.rpartition('.')[0] for mod in glob.glob('*py') if not mod.startswith('__')]
>>> modules = {mod: importlib.import_module('dill.' + mod) for mod in modules}
>>> pprint.pprint({name: pickle.whichmodule(mod, name) for name, mod in modules.items()})
{'_dill': 'dill._shims',
 '_objects': 'dill',
 '_shims': 'dill._dill',
 'detect': 'dill',
 'objtypes': 'dill',
 'pointers': 'dill',
 'settings': '__main__',
 'source': 'dill',
 'temp': 'dill'}
```
Note how `_shims` is attributed to `dill._dill` and `settings` is attributed to `__main__`(???). But currently they are not saved as globals.
@anivegesana
Copy link
Contributor

That behavior is bizarre. I guess in CPython, they assumed that every global would have a __module__ and they didn't allow modules to be pickled, so it worked out fine for them. Just not for us.

@mmckerns mmckerns self-requested a review June 5, 2022 09:51
Copy link
Member

@mmckerns mmckerns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mmckerns mmckerns merged commit 4462409 into uqfoundation:master Jun 8, 2022
@mmckerns
Copy link
Member

mmckerns commented Jun 8, 2022

Nice find. I can't wait to see what unintended consequences this has...

@mmckerns mmckerns added this to the dill-0.3.6 milestone Jun 8, 2022
@mmckerns mmckerns added the bugfix label Jun 8, 2022
@leogama leogama deleted the patch-1 branch June 8, 2022 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants